Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency

Abstract

Sample efficiency is crucial for imitation learning methods to be applicable in real-world applications. Many studies improve sample efficiency by extending adversarial imitation to be off-policy, regardless of the fact that these off-policy extensions could either change the original objective or involve complicated optimization. We revisit the foundation of adversarial imitation and propose an off-policy, sample-efficient approach that requires no adversarial training or min-max optimization. Our formulation capitalizes on two key insights: (1) the similarity between the Bellman equation and the stationary state-action distribution equation allows us to derive a novel temporal difference (TD) learning approach; and (2) the use of a deterministic policy simplifies the TD learning. Combined, these insights yield a practical algorithm, Deterministic and Discriminative Imitation (D2-Imitation), which operates by first partitioning samples into two replay buffers and then learning a deterministic policy via off-policy reinforcement learning. Our empirical results show that D2-Imitation is effective in achieving good sample efficiency, outperforming several off-policy extension approaches of adversarial imitation on many control tasks.
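The partitioning step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `partition_samples`, the scoring function `toy_score`, the 0.5 threshold, and the scalar state-action pairs are all hypothetical, chosen only to show how samples might be routed into two replay buffers. The subsequent off-policy reinforcement learning stage is not shown.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Tuple

# Hypothetical transition type: scalar (state, action) pairs for illustration.
Transition = Tuple[float, float]


def partition_samples(
    samples: Iterable[Transition],
    score: Callable[[Transition], float],
    threshold: float = 0.5,   # assumed cut-off, not taken from the paper
    capacity: int = 10_000,   # assumed buffer size
) -> Tuple[Deque[Transition], Deque[Transition]]:
    """Route each sample into one of two bounded replay buffers by its score."""
    positive: Deque[Transition] = deque(maxlen=capacity)
    negative: Deque[Transition] = deque(maxlen=capacity)
    for transition in samples:
        target = positive if score(transition) >= threshold else negative
        target.append(transition)
    return positive, negative


def toy_score(transition: Transition) -> float:
    """Stand-in for a learned discriminator (assumption): the pretend expert
    takes action = -state, and the score decays with distance from that."""
    state, action = transition
    return 1.0 / (1.0 + abs(action + state))


samples = [(1.0, -1.0), (2.0, 0.5), (0.5, -0.4), (3.0, 3.0)]
positive, negative = partition_samples(samples, toy_score)
print(list(positive))  # expert-like samples
print(list(negative))  # remaining samples
```

In this toy setup, the two samples whose actions are close to `-state` land in the first buffer and the other two in the second. A real system would replace `toy_score` with a trained model.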

Similar resources

Generative Adversarial Imitation Learning

Consider learning a policy from example expert behavior, without interaction with the expert or access to reinforcement signal. One approach is to recover the expert’s cost function with inverse reinforcement learning, then extract a policy from that cost function with reinforcement learning. This approach is indirect and can be slow. We propose a new general framework for directly extracting a...

Social Cognition: Imitation, Imitation, Imitation

Monkeys recognize when they are being imitated, but they seem unable to learn by imitation. These facts make sense if imitation is seen as two different capacities: social mirroring, when actions are matched and have social benefits; and learning by copying, when new behavioural routines are acquired by observation.

Model-based Adversarial Imitation Learning

Generative adversarial learning is a popular new approach to training generative models which has been proven successful for other related problems as well. The general idea is to maintain an oracle D that discriminates between the expert’s data distribution and that of the generative model G. The generative model is trained to capture the expert’s distribution by maximizing the probability of ...

Multimodal Storytelling via Generative Adversarial Imitation Learning

Deriving event storylines is an effective summarization method to succinctly organize extensive information, which can significantly alleviate the pain of information overload. The critical challenge is the lack of widely recognized definition of storyline metric. Prior studies have developed various approaches based on different assumptions about users’ interests. These works can extract inter...

Multi-agent Generative Adversarial Imitation Learning

We propose a new framework for multi-agent imitation learning for general Markov games, where we build upon a generalized notion of inverse reinforcement learning. We introduce a practical multi-agent actor-critic algorithm with good empirical performance. Our method can be used to imitate complex behaviors in high-dimensional environments with multiple cooperative or competitive agents. ...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i8.20813